AITopics | rhetorical role

Collaborating Authors

rhetorical role

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Structured Definitions and Segmentations for Legal Reasoning in LLMs: A Study on Indian Legal Data

Khatri, Mann, Yusuf, Mirza, Shah, Rajiv Ratn, Kumaraguru, Ponnurangam

arXiv.org Artificial IntelligenceNov-27-2025

Large Language Models (LLMs), trained on extensive datasets from the web, exhibit remarkable general reasoning skills. Despite this, they often struggle in specialized areas like law, mainly because they lack domain-specific pretraining. The legal field presents unique challenges, as legal documents are generally long and intricate, making it hard for models to process the full text efficiently. Previous studies have examined in-context approaches to address the knowledge gap, boosting model performance in new domains without full domain alignment. In our paper, we analyze model behavior on legal tasks by conducting experiments in three areas: (i) reorganizing documents based on rhetorical roles to assess how structured information affects long context processing and model decisions, (ii) defining rhetorical roles to familiarize the model with legal terminology, and (iii) emulating the step-by-step reasoning of courts regarding rhetorical roles to enhance model reasoning. These experiments are conducted in a zero-shot setting across three Indian legal judgment prediction datasets. Our results reveal that organizing data or explaining key legal terms significantly boosts model performance, with a minimum increase of ~1.5% and a maximum improvement of 4.36% in F1 score compared to the baseline.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.20669

Country:

Europe (0.93)
North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

What Are the Facts? Automated Extraction of Court-Established Facts from Criminal-Court Opinions

Bendová, Klára, Knap, Tomáš, Černý, Jan, Pour, Vojtěch, Savelka, Jaromir, Kvapilíková, Ivana, Drápal, Jakub

arXiv.org Artificial IntelligenceNov-10-2025

Criminal justice administrative data contain only a limited amount of information about the committed offense. However, there is an unused source of extensive information in continental European courts' decisions: descriptions of criminal behaviors in verdicts by which offenders are found guilty. In this paper, we study the feasibility of extracting these descriptions from publicly available court decisions from Slovakia. We use two different approaches for retrieval: regular expressions and large language models (LLMs). Our baseline was a simple method employing regular expressions to identify typical words occurring before and after the description. The advanced regular expression approach further focused on "sparing" and its normalization (insertion of spaces between individual letters), typical for delineating the description. The LLM approach involved prompting the Gemini Flash 2.0 model to extract the descriptions using predefined instructions. Although the baseline identified descriptions in only 40.5% of verdicts, both methods significantly outperformed it, achieving 97% with advanced regular expressions and 98.75% with LLMs, and 99.5% when combined. Evaluation by law students showed that both advanced methods matched human annotations in about 90% of cases, compared to just 34.5% for the baseline. LLMs fully matched human-labeled descriptions in 91.75% of instances, and a combination of advanced regular expressions with LLMs reached 92%.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.0532

Country:

North America > United States (0.46)
Europe > Czechia (0.29)

Genre: Research Report (1.00)

Industry: Law > Criminal Law (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Segment First, Retrieve Better: Realistic Legal Search via Rhetorical Role-Based Queries

Nigam, Shubham Kumar, Dubey, Tanmay, Shallum, Noel, Bhattacharya, Arnab

arXiv.org Artificial IntelligenceAug-4-2025

Legal precedent retrieval is a cornerstone of the common law system, governed by the principle of stare decisis, which demands consistency in judicial decisions. However, the growing complexity and volume of legal documents challenge traditional retrieval methods. TraceRetriever mirrors real-world legal search by operating with limited case information, extracting only rhetorically significant segments instead of requiring complete documents. Our pipeline integrates BM25, Vector Database, and Cross-Encoder models, combining initial results through Reciprocal Rank Fusion before final re-ranking. Rhetorical annotations are generated using a Hierarchical BiLSTM CRF classifier trained on Indian judgments. Evaluated on IL-PCR and COLIEE 2025 datasets, TraceRetriever addresses growing document volume challenges while aligning with practical search constraints, reliable and scalable foundation for precedent retrieval enhancing legal research when only partial case knowledge is available.

information retrieval, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2508.00679

Country: Asia > India (0.47)

Genre: Research Report > New Finding (0.93)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.73)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

MARRO: Multi-headed Attention for Rhetorical Role Labeling in Legal Documents

Bambroo, Purbid, Adhikary, Subinay, Bhattacharya, Paheli, Chakraborty, Abhijnan, Ghosh, Saptarshi, Ghosh, Kripabandhu

arXiv.org Artificial IntelligenceMar-8-2025

Identification of rhetorical roles like facts, arguments, and final judgments is central to understanding a legal case document and can lend power to other downstream tasks like legal case summarization and judgment prediction. However, there are several challenges to this task. Legal documents are often unstructured and contain a specialized vocabulary, making it hard for conventional transformer models to understand them. Additionally, these documents run into several pages, which makes it difficult for neural models to capture the entire context at once. Lastly, there is a dearth of annotated legal documents to train deep learning models. Previous state-of-the-art approaches for this task have focused on using neural models like BiLSTM-CRF or have explored different embedding techniques to achieve decent results. While such techniques have shown that better embedding can result in improved model performance, not many models have focused on utilizing attention for learning better embeddings in sentences of a document. Additionally, it has been recently shown that advanced techniques like multi-task learning can help the models learn better representations, thereby improving performance. In this paper, we combine these two aspects by proposing a novel family of multi-task learning-based models for rhetorical role labeling, named MARRO, that uses transformer-inspired multi-headed attention. Using label shift as an auxiliary task, we show that models from the MARRO family achieve state-of-the-art results on two labeled datasets for rhetorical role labeling, from the Indian and UK Supreme Courts.

dataset, prediction, rhetorical role, (15 more...)

arXiv.org Artificial Intelligence

2503.10659

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > India > West Bengal > Kharagpur (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(9 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.48)

Industry:

Law > Criminal Law (0.93)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Add feedback

LegalSeg: Unlocking the Structure of Indian Legal Judgments Through Rhetorical Role Classification

Nigam, Shubham Kumar, Dubey, Tanmay, Sharma, Govind, Shallum, Noel, Ghosh, Kripabandhu, Bhattacharya, Arnab

arXiv.org Artificial IntelligenceFeb-9-2025

In this paper, we address the task of semantic segmentation of legal documents through rhetorical role classification, with a focus on Indian legal judgments. We introduce LegalSeg, the largest annotated dataset for this task, comprising over 7,000 documents and 1.4 million sentences, labeled with 7 rhetorical roles. To benchmark performance, we evaluate multiple state-of-the-art models, including Hierarchical BiLSTM-CRF, TransformerOverInLegalBERT (ToInLegalBERT), Graph Neural Networks (GNNs), and Role-Aware Transformers, alongside an exploratory RhetoricLLaMA, an instruction-tuned large language model. Our results demonstrate that models incorporating broader context, structural relationships, and sequential sentence information outperform those relying solely on sentence-level features. Additionally, we conducted experiments using surrounding context and predicted or actual labels of neighboring sentences to assess their impact on classification accuracy. Despite these advancements, challenges persist in distinguishing between closely related roles and addressing class imbalance. Our work underscores the potential of advanced techniques for improving legal document understanding and sets a strong foundation for future research in legal NLP.

large language model, machine learning, rhetorical role, (22 more...)

arXiv.org Artificial Intelligence

2502.05836

Country:

Europe > Italy (0.14)
Asia > India > Rajasthan (0.04)
Asia > India > West Bengal > Kolkata (0.04)
(7 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Law (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Add feedback

HiCuLR: Hierarchical Curriculum Learning for Rhetorical Role Labeling of Legal Documents

Santosh, T. Y. S. S., Isaia, Apolline, Hong, Shiyu, Grabmair, Matthias

arXiv.org Artificial IntelligenceSep-27-2024

Rhetorical Role Labeling (RRL) of legal documents is pivotal for various downstream tasks such as summarization, semantic case search and argument mining. Existing approaches often overlook the varying difficulty levels inherent in legal document discourse styles and rhetorical roles. In this work, we propose HiCuLR, a hierarchical curriculum learning framework for RRL. It nests two curricula: Rhetorical Role-level Curriculum (RC) on the outer layer and Document-level Curriculum (DC) on the inner layer. DC categorizes documents based on their difficulty, utilizing metrics like deviation from a standard discourse structure and exposes the model to them in an easy-to-difficult fashion. RC progressively strengthens the model to discern coarse-to-fine-grained distinctions between rhetorical roles. Our experiments on four RRL datasets demonstrate the efficacy of HiCuLR, highlighting the complementary nature of DC and RC.

curriculum, dataset, rhetorical role, (14 more...)

arXiv.org Artificial Intelligence

2409.18647

Country:

Asia > India (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Artificial Intelligence (AI) in Legal Data Mining

Deroy, Aniket, Bailung, Naksatra Kumar, Ghosh, Kripabandhu, Ghosh, Saptarshi, Chakraborty, Abhijnan

arXiv.org Artificial IntelligenceMay-23-2024

Despite the availability of vast amounts of data, legal data is often unstructured, making it difficult even for law practitioners to ingest and comprehend the same. It is important to organise the legal information in a way that is useful for practitioners and downstream automation tasks. The word ontology was used by Greek philosophers to discuss concepts of existence, being, becoming and reality. Today, scientists use this term to describe the relation between concepts, data, and entities. A great example for a working ontology was developed by Dhani and Bhatt. This ontology deals with Indian court cases on intellectual property rights (IPR) The future of legal ontologies is likely to be handled by computer experts and legal experts alike.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2405.14707

Country:

North America > United States (0.14)
Asia > India (0.04)
Europe > United Kingdom > England > Northamptonshire (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (0.93)
Law > Litigation (0.89)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mind Your Neighbours: Leveraging Analogous Instances for Rhetorical Role Labeling for Legal Documents

Santosh, T. Y. S. S, Sarwat, Hassan, Abdou, Ahmed, Grabmair, Matthias

arXiv.org Artificial IntelligenceMar-31-2024

Rhetorical Role Labeling (RRL) of legal judgments is essential for various tasks, such as case summarization, semantic search and argument mining. However, it presents challenges such as inferring sentence roles from context, interrelated roles, limited annotated data, and label imbalance. This study introduces novel techniques to enhance RRL performance by leveraging knowledge from semantically similar instances (neighbours). We explore inference-based and training-based approaches, achieving remarkable improvements in challenging macro-F1 scores. For inference-based methods, we explore interpolation techniques that bolster label predictions without re-training. While in training-based methods, we integrate prototypical learning with our novel discourse-aware contrastive method that work directly on embedding spaces. Additionally, we assess the cross-domain applicability of our methods, demonstrating their effectiveness in transferring knowledge across diverse legal domains.

dataset, prototype, rhetorical role, (13 more...)

arXiv.org Artificial Intelligence

2404.01344

Country:

Asia > India (0.04)
North America > Canada (0.04)
Europe > Poland (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.67)

Add feedback

Fact-based Court Judgment Prediction

Nigam, Shubham Kumar, Deroy, Aniket

arXiv.org Artificial IntelligenceNov-22-2023

This extended abstract extends the research presented in "ILDC for CJPE: Indian Legal Documents Corpus for Court Judgment Prediction and Explanation" \cite{malik-etal-2021-ildc}, focusing on fact-based judgment prediction within the context of Indian legal documents. We introduce two distinct problem variations: one based solely on facts, and another combining facts with rulings from lower courts (RLC). Our research aims to enhance early-phase case outcome prediction, offering significant benefits to legal professionals and the general public. The results, however, indicated a performance decline compared to the original ILDC for CJPE study, even after implementing various weightage schemes in our DELSumm algorithm. Additionally, using only facts for legal judgment prediction with different transformer models yielded results inferior to the state-of-the-art outcomes reported in the "ILDC for CJPE" study.

dataset, judgment prediction, prediction, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3632754.3632765

2311.1335

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > India > West Bengal > Kharagpur (0.04)
Asia > India > Uttar Pradesh > Kanpur (0.04)

Genre: Research Report (0.82)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)

Add feedback

Enhancing Pre-Trained Language Models with Sentence Position Embeddings for Rhetorical Roles Recognition in Legal Opinions

Belfathi, Anas, Hernandez, Nicolas, Monceaux, Laura

arXiv.org Artificial IntelligenceOct-8-2023

The legal domain is a vast and complex field that involves a considerable amount of text analysis, including laws, legal arguments, and legal opinions. Legal practitioners must analyze these texts to understand legal cases, research legal precedents, and prepare legal documents. The size of legal opinions continues to grow, making it increasingly challenging to develop a model that can accurately predict the rhetorical roles of legal opinions given their complexity and diversity. In this research paper, we propose a novel model architecture for automatically predicting rhetorical roles using pre-trained language models (PLMs) enhanced with knowledge of sentence position information within a document. Based on an annotated corpus from the LegalEval@SemEval2023 competition, we demonstrate that our approach requires fewer parameters, resulting in lower computational costs when compared to complex architectures employing a hierarchical model in a global-context, yet it achieves great performance. Moreover, we show that adding more attention to a hierarchical model based only on BERT in the local-context, along with incorporating sentence position information, enhances the results.

computational linguistic, information, representation, (16 more...)

arXiv.org Artificial Intelligence

2310.05276

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.05)
North America > United States > New York > New York County > New York City (0.04)
(9 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback